feat(queue): add Publisher.PublishAfter delay primitive by behinddwalls · Pull Request #176 · uber/submitqueue

behinddwalls · 2026-06-02T18:02:10Z

Summary

Why?

The orchestrator's build poll loop needs to space Status calls out
without consuming retry_count / DLQ retry slots. Delivery.Nack does
schedule redelivery after a delay, but it overloads retry_count —
which the operator relies on to flag real failures. The fix is a
separate "postpone this work" primitive, distinct from "this delivery
failed, try again", so both signals stay meaningful.

What?

Adds Publisher.PublishAfter(ctx, topic, msg, delayMs) to the queue
extension: a fresh message inserted into the topic, made visible to
subscribers only after delayMs from now. Distinct from
Delivery.Nack(requeueAfterMillis):

Nack is "this delivery failed, try again" — it bumps retry_count
and eventually trips DLQ.
PublishAfter is "postpone this work" — retry_count resets to 0,
DLQ stays available for true failures.

SQL backing in extension/queue/mysql:

New visible_after BIGINT UNSIGNED NOT NULL DEFAULT 0 column on
queue_messages. Default 0 means immediately visible — back-compat
for any existing rows.
messageStore.InsertDelayed(ctx, topic, messages, visibleAfterMs) is
the underlying primitive. Insert is now a thin wrapper that passes
visibleAfterMs = 0.
FetchByOffset gains a nowMs parameter and an
AND visible_after <= ? predicate so subscribers skip rows whose
delivery is still deferred.
MoveToDLQ writes visible_after = 0 explicitly: any delay on the
original message has already been consumed by the time it failed.

The first consumer of PublishAfter is the orchestrator's poll-driven
buildsignal loop, which lands in the stacked PR on top of this one.

Test Plan

✅ bazel test //extension/queue/... — all pass, including new
coverage for PublishAfter (positive / zero / negative delay,
closed-publisher) and for FetchByOffset skipping deferred rows via
the new visible_after predicate.
✅ make fmt lint check-tidy check-gazelle check-mocks — clean.
✅ make build — all targets compile.

Issues

Stack

## Summary ### Why? The build stage needs a vendor-agnostic abstraction for talking to a Build Runner — one that fits the existing extension family (storage, queue, conflict) and that does not have to change when the first real backend lands. Design rationale lives in `doc/rfc/build-runner.md`. ### What? Introduces the `BuildRunner` contract — three verbs, all keyed by an `entity.BuildID`: `Trigger` submits a build (ordered `base` + `head` change sets plus a free-form metadata map) and returns its ID; `Status` fetches the current `BuildStatus` and runner-defined metadata; `Cancel` requests cancellation. See `extension/buildrunner/build_runner.go` for the signatures. Highlights: - `Trigger` takes ordered `base` (assumed-good prefix from dependency batches) and `head` (changes being verified) — a runner can cache or short-circuit a prefix it has validated before, and attribute terminal failures to base vs head via `BuildMetadata`. - Build identifiers are the typed `entity.BuildID` (a lightweight `{ID string}` value, also the queue payload envelope) rather than a bare `string`, so the runner contract and the on-queue messages share one identifier type and the compiler distinguishes a build ID from other string IDs. - `metadata` is a free-form `map[string]string` for caller-supplied attributes (requester, ticket ID, trace ID) the runner MAY persist or echo back via `Status`. - `Trigger` and `Cancel` return promptly; `Status` MAY be lengthy. - The `BuildStatus` enum is narrowed to `Accepted` / `Running` / `Succeeded` / `Failed` / `Cancelled` so every backend can map its native lifecycle without leaking runner-specific stages. - `BuildMetadata` is added as a free-form map for round-tripping runner-defined attributes (build URL, duration, SHA, etc.). Naming: the interface is `BuildRunner` (not `BuildManager` — the "Manager" suffix doesn't describe what it does, and "Build" alone would collide cognitively with `entity.Build`). The package is `extension/buildrunner` to follow the repo convention used by `mergechecker`, `changeprovider`, `pusher`, `counter`, and `storage`. Implementation (noop backend, queue `PublishAfter` primitive, controller wiring, and the poll-driven `buildsignal` loop) lands in the follow-up PRs stacked on top of this one. ## Test Plan - ✅ `make test` — entity/build tests pass; mock-only consumers of the interface compile and pass. - ✅ `make fmt lint check-tidy check-gazelle check-mocks` — clean. - ✅ `make build` — all targets compile.

## Summary ### Why? Wiring tests for the orchestrator's build stage need a `BuildRunner` implementation, but real backends (BuildKite, Jenkins, etc.) won't land until later. A noop that immediately reports success is also useful as the local default in `make local-start` and as a best-case baseline where every build passes. ### What? Adds `extension/buildrunner/noop` — a `BuildRunner` that does no real work. Every `Trigger` returns a unique `noop-<n>` build ID with no error; every `Status` returns `BuildStatusSucceeded` with no metadata; `Cancel` is a no-op. The atomic counter keeps IDs unique under concurrent use. ## Test Plan - ✅ `make test` — 36 unit tests pass, including the new `extension/buildrunner/noop:noop_test` (Trigger uniqueness, Status, Cancel, interface satisfaction). - ✅ `make fmt lint check-tidy check-gazelle check-mocks` — clean. - ✅ `make build` — all targets compile.

## Summary ### Why? The orchestrator's build poll loop needs to space `Status` calls out without consuming `retry_count` / DLQ retry slots. `Delivery.Nack` does schedule redelivery after a delay, but it overloads `retry_count` — which the operator relies on to flag real failures. The fix is a separate "postpone this work" primitive, distinct from "this delivery failed, try again", so both signals stay meaningful. ### What? Adds `Publisher.PublishAfter(ctx, topic, msg, delayMs)` to the queue extension: a fresh message inserted into the topic, made visible to subscribers only after `delayMs` from now. Distinct from `Delivery.Nack(requeueAfterMillis)`: - `Nack` is "this delivery failed, try again" — it bumps `retry_count` and eventually trips DLQ. - `PublishAfter` is "postpone this work" — `retry_count` resets to 0, DLQ stays available for true failures. SQL backing in `extension/queue/mysql`: - New `visible_after BIGINT UNSIGNED NOT NULL DEFAULT 0` column on `queue_messages`. Default 0 means immediately visible — back-compat for any existing rows. - `messageStore.InsertDelayed(ctx, topic, messages, visibleAfterMs)` is the underlying primitive. `Insert` is now a thin wrapper that passes `visibleAfterMs = 0`. - `FetchByOffset` gains a `nowMs` parameter and an `AND visible_after <= ?` predicate so subscribers skip rows whose delivery is still deferred. - `MoveToDLQ` writes `visible_after = 0` explicitly: any delay on the original message has already been consumed by the time it failed. The first consumer of `PublishAfter` is the orchestrator's poll-driven `buildsignal` loop, which lands in the stacked PR on top of this one. ## Test Plan - ✅ `bazel test //extension/queue/...` — all pass, including new coverage for `PublishAfter` (positive / zero / negative delay, closed-publisher) and for `FetchByOffset` skipping deferred rows via the new `visible_after` predicate. - ✅ `make fmt lint check-tidy check-gazelle check-mocks` — clean. - ✅ `make build` — all targets compile.

) ## Summary ### Why? The build stage needs a vendor-agnostic abstraction for talking to a Build Runner — one that fits the existing extension family (storage, queue, conflict) and that does not have to change when the first real backend lands. Design rationale lives in `doc/rfc/build-runner.md`. ### What? Introduces the `BuildRunner` contract — three verbs, all keyed by an `entity.BuildID`: `Trigger` submits a build (ordered `base` + `head` change sets plus a free-form metadata map) and returns its ID; `Status` fetches the current `BuildStatus` and runner-defined metadata; `Cancel` requests cancellation. See `extension/buildrunner/build_runner.go` for the signatures. Highlights: - `Trigger` takes ordered `base` (assumed-good prefix from dependency batches) and `head` (changes being verified) — a runner can cache or short-circuit a prefix it has validated before, and attribute terminal failures to base vs head via `BuildMetadata`. - Build identifiers are the typed `entity.BuildID` (a lightweight `{ID string}` value, also the queue payload envelope) rather than a bare `string`, so the runner contract and the on-queue messages share one identifier type and the compiler distinguishes a build ID from other string IDs. - `metadata` is a free-form `map[string]string` for caller-supplied attributes (requester, ticket ID, trace ID) the runner MAY persist or echo back via `Status`. - `Trigger` and `Cancel` return promptly; `Status` MAY be lengthy. - The `BuildStatus` enum is narrowed to `Accepted` / `Running` / `Succeeded` / `Failed` / `Cancelled` so every backend can map its native lifecycle without leaking runner-specific stages. - `BuildMetadata` is added as a free-form map for round-tripping runner-defined attributes (build URL, duration, SHA, etc.). Naming: the interface is `BuildRunner` (not `BuildManager` — the "Manager" suffix doesn't describe what it does, and "Build" alone would collide cognitively with `entity.Build`). The package is `extension/buildrunner` to follow the repo convention used by `mergechecker`, `changeprovider`, `pusher`, `counter`, and `storage`. Implementation (noop backend, queue `PublishAfter` primitive, controller wiring, and the poll-driven `buildsignal` loop) lands in the follow-up PRs stacked on top of this one. ## Test Plan - ✅ `make test` — entity/build tests pass; mock-only consumers of the interface compile and pass. - ✅ `make fmt lint check-tidy check-gazelle check-mocks` — clean. - ✅ `make build` — all targets compile. ## Stack 1. @ #175 1. #179 1. #176 1. #177

## Summary ### Why? Wiring tests for the orchestrator's build stage need a `BuildRunner` implementation, but real backends (BuildKite, Jenkins, etc.) won't land until later. A noop that immediately reports success is also useful as the local default in `make local-start` and as a best-case baseline where every build passes. ### What? Adds `extension/buildrunner/noop` — a `BuildRunner` that does no real work. Every `Trigger` returns a unique `noop-<n>` build ID with no error; every `Status` returns `BuildStatusSucceeded` with no metadata; `Cancel` is a no-op. The atomic counter keeps IDs unique under concurrent use. ## Test Plan - ✅ `make test` — 36 unit tests pass, including the new `extension/buildrunner/noop:noop_test` (Trigger uniqueness, Status, Cancel, interface satisfaction). - ✅ `make fmt lint check-tidy check-gazelle check-mocks` — clean. - ✅ `make build` — all targets compile. ## Issues ## Stack 1. #175 1. @ #179 1. #176 1. #177

behinddwalls mentioned this pull request Jun 2, 2026

feat(buildrunner): add vendor-agnostic BuildRunner interface + RFC #175

Merged

behinddwalls force-pushed the preetam/queue-publish-after branch from b8ff9e8 to 42a34ea Compare June 2, 2026 18:09

behinddwalls force-pushed the preetam/build-mgr-interface branch from 293db26 to 9e9cf91 Compare June 2, 2026 18:09

behinddwalls mentioned this pull request Jun 2, 2026

feat(buildrunner): noop impl, poll-driven buildsignal, pipeline wiring #177

Merged

behinddwalls marked this pull request as ready for review June 2, 2026 18:13

behinddwalls requested review from a team and sbalabanov as code owners June 2, 2026 18:13

behinddwalls force-pushed the preetam/build-mgr-interface branch from 9e9cf91 to dc4b4cb Compare June 2, 2026 18:28

behinddwalls force-pushed the preetam/queue-publish-after branch from 42a34ea to 2dd6d25 Compare June 2, 2026 18:28

behinddwalls force-pushed the preetam/build-mgr-interface branch from dc4b4cb to b6f1414 Compare June 2, 2026 18:47

behinddwalls force-pushed the preetam/queue-publish-after branch from 2dd6d25 to a7d0c2a Compare June 2, 2026 18:47

behinddwalls changed the base branch from preetam/build-mgr-interface to preetam/buildrunner-noop June 2, 2026 18:53

behinddwalls mentioned this pull request Jun 2, 2026

feat(buildrunner/noop): add pass-through BuildRunner stub #179

Merged

behinddwalls force-pushed the preetam/queue-publish-after branch from a7d0c2a to d83c3a5 Compare June 2, 2026 18:57

behinddwalls force-pushed the preetam/buildrunner-noop branch 2 times, most recently from a7226e7 to ff9bcc2 Compare June 2, 2026 22:04

behinddwalls force-pushed the preetam/queue-publish-after branch 2 times, most recently from d735e8f to c368626 Compare June 2, 2026 22:29

behinddwalls force-pushed the preetam/buildrunner-noop branch from ff9bcc2 to c4455b9 Compare June 2, 2026 22:29

behinddwalls force-pushed the preetam/queue-publish-after branch from c368626 to 64d1a6e Compare June 2, 2026 22:53

behinddwalls force-pushed the preetam/buildrunner-noop branch from c4455b9 to 6380a1a Compare June 2, 2026 22:53

behinddwalls added 3 commits June 2, 2026 16:07

behinddwalls force-pushed the preetam/buildrunner-noop branch from 6380a1a to 5bc2c0d Compare June 2, 2026 23:08

behinddwalls force-pushed the preetam/queue-publish-after branch from 64d1a6e to 1311592 Compare June 2, 2026 23:08

JamyDev approved these changes Jun 3, 2026

View reviewed changes

Base automatically changed from preetam/buildrunner-noop to main June 3, 2026 02:01

Merge branch 'main' into preetam/queue-publish-after

e8b3617

behinddwalls merged commit 191b93a into main Jun 3, 2026
13 checks passed

behinddwalls deleted the preetam/queue-publish-after branch June 3, 2026 02:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(queue): add Publisher.PublishAfter delay primitive#176

feat(queue): add Publisher.PublishAfter delay primitive#176
behinddwalls merged 4 commits into
mainfrom
preetam/queue-publish-after

behinddwalls commented Jun 2, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

behinddwalls commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why?

What?

Test Plan

Issues

Stack

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

behinddwalls commented Jun 2, 2026 •

edited

Loading